Enhancing Taxonomies by Providing Many Paths
Abstract
A taxonomy organizes concepts or topics in a hierarchical structure and can be created manually or by automated systems. One major drawback of taxonomies is that they require users to share the taxonomy creator's view of the topics. That is, when a user follows a top-down path to find a specific topic of interest, she must make choices in the fixed sequence that the hierarchy imposes. As a result, users who do not share that mental taxonomy are likely to have more difficulty finding the desired topic. Although remedies such as cross-topic links can reduce this problem, they break the hierarchical structure and greatly increase the cost of human editing. Faceted search and browsing have also been proposed to address this problem; however, identifying facets in large-scale datasets is a significant challenge. In this paper, we propose a new approach to taxonomy expansion that provides more flexible views. Starting from an existing taxonomy, our algorithm finds possible alternative paths and generates a new, expanded taxonomy that gives users more flexibility in their browsing choices. In experiments on the dmoz Open Directory Project, the rebuilt taxonomies show favorable characteristics (more alternative paths and shorter paths to information). User studies show that our expanded taxonomies are preferred over the original.
Similar resources
A comparative analysis of the evolution of a taxonomy for best practices: a case for 'knowledge efficiency'
Taxonomies play an increasingly important role in knowledge management of business best practices, providing a basis by which to index, find and communicate knowledge. However, knowledge continues to evolve over time. As a result, taxonomies must also continue to evolve as organizations innovate and change. Reportedly, firms customize best-practice taxonomies to meet their unique organization n...
Rapid Induction of Multiple Taxonomies for Enhanced Faceted Text Browsing
In this paper we present and compare two methodologies for rapidly inducing multiple subject-specific taxonomies from crawled data. The first builds the taxonomy with a sentence-level word co-occurrence frequency method, while the second bootstraps a Word2Vec-based algorithm with a directed crawler. We exploit the multilingual open-content directory of the W...
Learning Naive Bayes Classifiers From Attribute Value Taxonomies and Partially Specified Data
Partially specified data are commonplace in many practical applications of machine learning where different instances are described at different levels of precision relative to an attribute value taxonomy (AVT). This paper describes AVT-NBL – a variant of the Naïve Bayes Learning algorithm that effectively exploits user-supplied attribute value taxonomies to construct compact and accurate Naïve...
Learning Naïve Bayes Classifiers From Attribute Value Taxonomies and Partially Specified Data
Partially specified data are commonplace in many practical applications of machine learning where different instances are described at different levels of precision relative to an attribute value taxonomy (AVT). This paper describes AVT-NBL, an extension of the Naïve Bayes Learning algorithm that effectively exploits user-supplied attribute value taxonomies to construct compact and accurate Naïve...
Learning to integrate web taxonomies
We investigate machine learning methods for automatically integrating objects from different taxonomies into a master taxonomy. This problem is not only currently pervasive on the Web, but is also important to the emerging Semantic Web. A straightforward approach to automating this process would be to build classifiers through machine learning and then use these classifiers to classify objects ...